---
title: "HuggingFaceLocalGenerator"
id: huggingfacelocalgenerator
slug: "/huggingfacelocalgenerator"
description: "`HuggingFaceLocalGenerator` provides an interface to generate text using a Hugging Face model that runs locally."
---

# HuggingFaceLocalGenerator

`HuggingFaceLocalGenerator` provides an interface to generate text using a Hugging Face model that runs locally.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | After a [`PromptBuilder`](../builders/promptbuilder.mdx)                                                          |
| **Mandatory init variables**           | `token`: The Hugging Face API token. Can be set with `HF_API_TOKEN` or `HF_TOKEN` env var.              |
| **Mandatory run variables**            | `prompt`: A string containing the prompt for the LLM                                                    |
| **Output variables**                   | `replies`: A list of strings with all the replies generated by the LLM                                  |
| **API reference**                      | [Generators](/reference/generators-api)                                                                        |
| **GitHub link**                        | https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/hugging_face_local.py |

</div>

## Overview

Keep in mind that if LLMs run locally, you may need a powerful machine to run them. This depends strongly on the model you select and its parameter count.

:::info Looking for chat completion?

This component is designed for text generation, not for chat. If you want to use Hugging Face LLMs for chat, consider using [`HuggingFaceLocalChatGenerator`](huggingfacelocalchatgenerator.mdx) instead.
:::

For remote files authorization, this component uses a `HF_API_TOKEN` environment variable by default. Otherwise, you can pass a Hugging Face API token at initialization with `token`:

```python
local_generator = HuggingFaceLocalGenerator(token=Secret.from_token("<your-api-key>"))
```

### Streaming

This Generator supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) the tokens from the LLM directly in output. To do so, pass a function to the `streaming_callback` init parameter.

## Usage

### On its own

```python
from haystack.components.generators import HuggingFaceLocalGenerator

generator = HuggingFaceLocalGenerator(model="google/flan-t5-large",
                                      task="text2text-generation",
                                      generation_kwargs={
                                        "max_new_tokens": 100,
                                        "temperature": 0.9,
                                        })

generator.warm_up()
print(generator.run("Who is the best American actor?"))
## {'replies': ['john wayne']}
```

### In a Pipeline

```python
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Document

docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="Rome is the capital of Italy"), Document(content="Paris is the capital of France")])

generator = HuggingFaceLocalGenerator(model="google/flan-t5-large",
                                      task="text2text-generation",
                                      generation_kwargs={
                                        "max_new_tokens": 100,
                                        "temperature": 0.9,
                                        })

query = "What is the capital of France?"

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}?
"""
pipe = Pipeline()

pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

res=pipe.run({
    "prompt_builder": {
        "query": query
    },
    "retriever": {
        "query": query
    }
})

print(res)
```

## Additional References

🧑‍🍳 Cookbooks:

- [Use Zephyr 7B Beta with Hugging Face for RAG](https://haystack.deepset.ai/cookbook/zephyr-7b-beta-for-rag)
- [Information Extraction with Gorilla](https://haystack.deepset.ai/cookbook/information-extraction-gorilla)
- [RAG on the Oscars using Llama 3.1 models](https://haystack.deepset.ai/cookbook/llama3_rag)
- [Agentic RAG with Llama 3.2 3B](https://haystack.deepset.ai/cookbook/llama32_agentic_rag)
